
Escape control characters in JSON output #20089

Merged
merged 4 commits into from
Dec 23, 2014
Conversation

rolftimmermans
Contributor

The JSON spec (http://www.json.org) says that control characters are not allowed unescaped inside JSON strings, but Rust currently does not escape them. This PR escapes ASCII control characters in JSON output.

@rust-highfive
Contributor

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @erickt (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see CONTRIBUTING.md for more information.

b'\n' => "\\n".into_cow(),
b'\r' => "\\r".into_cow(),
b'\t' => "\\t".into_cow(),
b'\x00'...b'\x1f' | b'\x7f' => format!("\\u00{:0>2x}", *byte).into_cow(),
Contributor

Thanks for this! Unfortunately, this would be pretty expensive. Could you rewrite this to avoid the allocation here?

Contributor Author

Well, I could make the match exhaustive with static strings, but it would be pretty long (32 lines). Is that acceptable?
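
For illustration, the static-strings idea can be sketched in current Rust as below. This is not the code from the PR: the names `static_escape` and `escape_static` are hypothetical, `std::fmt::Write` is assumed as the output trait, and a 32-entry table stands in for the 32-arm match to keep the example short. Every escape is a `&'static str`, so nothing is allocated per byte.

```rust
use std::fmt::{self, Write};

/// Hypothetical helper: returns the static JSON escape for `byte`,
/// or `None` if the byte can be copied through unchanged.
fn static_escape(byte: u8) -> Option<&'static str> {
    // One entry per control byte 0x00..=0x1f, indexed by the byte value.
    const CONTROL: [&str; 32] = [
        "\\u0000", "\\u0001", "\\u0002", "\\u0003",
        "\\u0004", "\\u0005", "\\u0006", "\\u0007",
        "\\b", "\\t", "\\n", "\\u000b",
        "\\f", "\\r", "\\u000e", "\\u000f",
        "\\u0010", "\\u0011", "\\u0012", "\\u0013",
        "\\u0014", "\\u0015", "\\u0016", "\\u0017",
        "\\u0018", "\\u0019", "\\u001a", "\\u001b",
        "\\u001c", "\\u001d", "\\u001e", "\\u001f",
    ];
    match byte {
        b'"' => Some("\\\""),
        b'\\' => Some("\\\\"),
        0x00..=0x1f => Some(CONTROL[byte as usize]),
        0x7f => Some("\\u007f"),
        _ => None,
    }
}

/// Hypothetical helper: writes `s` to `wr` with JSON escapes applied.
/// Surrounding quotes are left to the caller.
fn escape_static<W: Write>(wr: &mut W, s: &str) -> fmt::Result {
    let mut start = 0;
    for (i, byte) in s.bytes().enumerate() {
        if let Some(escaped) = static_escape(byte) {
            // Flush the run of bytes that needed no escaping. Every escaped
            // byte is ASCII, so `i` is always a valid char boundary.
            wr.write_str(&s[start..i])?;
            wr.write_str(escaped)?;
            start = i + 1;
        }
    }
    // Flush whatever is left after the last escaped byte.
    wr.write_str(&s[start..])
}
```

Calling `escape_static(&mut out, "a\nb")` with `out` a `String` appends `a\nb` with the newline replaced by the two characters `\` and `n`.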

Contributor

Yeah, that would be fine. You could try something like:

    for (i, byte) in bytes.iter().enumerate() {
        let escaped = match *byte {
            b'"' => "\\\"",
            b'\\' => "\\\\",
            b'\x08' => "\\b",
            b'\x0c' => "\\f",
            b'\n' => "\\n",
            b'\r' => "\\r",
            b'\t' => "\\t",
            b'\x00'...b'\x1f' | b'\x7f' => "\\u00",
            _ => { continue; }
        };

        if start < i {
            try!(wr.write(bytes[start..i]));
        }

        try!(wr.write_str(escaped));

        match *byte {
            b'\x00'...b'\x1f' | b'\x7f' => try!(write!(wr, "{:0>2x}", *byte)),
            _ => {}
        }

        start = i + 1;
    }

Not perfect, but it at least doesn't force an allocation.

Contributor Author

Oh, that looks good. The problem, though, is that some control characters are already in the list, so we'd need a pattern like b'\x00'...b'\x07' | b'\x0b' | b'\x0e'...b'\x1f' | b'\x7f'. Perhaps that's getting a little complex. I've now updated it with static strings only – let me know which one you prefer.
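
For comparison, the narrowed pattern can be combined with the reviewer's loop roughly as follows. This is a minimal sketch in current Rust syntax (`..=` ranges and `?` in place of `...` and `try!`), not the code that was merged. The function name `escape_with_hex` and the use of `std::fmt::Write` are assumptions made for illustration. The match yields a flag alongside the static prefix, so the hex digits are written only for bytes that have no short escape.

```rust
use std::fmt::{self, Write};

/// Hypothetical helper: writes `s` to `wr` with JSON escapes applied,
/// formatting the hex digits of generic control bytes directly into `wr`.
fn escape_with_hex<W: Write>(wr: &mut W, s: &str) -> fmt::Result {
    let mut start = 0;
    for (i, byte) in s.bytes().enumerate() {
        let (escaped, needs_hex) = match byte {
            b'"' => ("\\\"", false),
            b'\\' => ("\\\\", false),
            b'\x08' => ("\\b", false),
            b'\x0c' => ("\\f", false),
            b'\n' => ("\\n", false),
            b'\r' => ("\\r", false),
            b'\t' => ("\\t", false),
            // Control bytes without a short escape: emit the prefix here
            // and the two hex digits below.
            b'\x00'..=b'\x07' | b'\x0b' | b'\x0e'..=b'\x1f' | b'\x7f' => ("\\u00", true),
            _ => continue,
        };

        // Flush the unescaped run before this byte. Every escaped byte is
        // ASCII, so `i` is always a valid char boundary.
        wr.write_str(&s[start..i])?;
        wr.write_str(escaped)?;
        if needs_hex {
            // Formats straight into the writer; no temporary String is built.
            write!(wr, "{:02x}", byte)?;
        }
        start = i + 1;
    }
    // Flush whatever is left after the last escaped byte.
    wr.write_str(&s[start..])
}
```

Compared with a full table of static strings, this keeps the match short at the cost of one formatting call per generic control byte.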

@erickt
Contributor

erickt commented Dec 21, 2014

Looks good! Can you add yourself to the AUTHORS.txt? I'll r+ it afterwards.

@rolftimmermans
Contributor Author

Ok, done. Thanks!

@alexcrichton
Member

Needs a rebase

@rolftimmermans
Contributor Author

Rebased

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Dec 22, 2014
@bors bors merged commit 0a7ef3f into rust-lang:master Dec 23, 2014